Support repo's Appstream data download and install #1844

Open
wants to merge 1 commit into
base: main

mcrha
Contributor

@mcrha mcrha commented Nov 8, 2024

Repositories can provide Appstream data for the packages they contain. Applications such as gnome-software and KDE Discover consume this data, so users can see those packages (apps) in them.

This is meant to be on par with PackageKit, which downloads and installs the repo's Appstream data.

Because this adds a dependency on the appstream library, there is also a CMake option, WITH_APPSTREAM, to turn the support off. The support is enabled by default.

That is, the /var/cache/libdnf5/$REPO_NAME/repodata/ directory can now also contain downloaded files with appstream in their name and, when installed, the data is stored under /var/cache/swcatalog/xml/.

CC @m-blaha @jan-kolarik

@mcrha
Contributor Author

mcrha commented Nov 8, 2024

Hrm, the CI needs either to add an appstream-devel dependency (or, better, pkgconfig(appstream)>=1.0) or to turn off the build of these bits by passing -DWITH_APPSTREAM=OFF to the cmake command. I have no idea how to do that, but if you could guide me, I'll change it either way.

libdnf5/repo/repo.cpp (review thread: outdated, resolved)
@Conan-Kudo
Member

We do not want AppStream data being downloaded by default, as it's quite huge, so only callers that can do something with that data should request it.

@mcrha
Contributor Author

mcrha commented Nov 13, 2024

> We do not want AppStream data being downloaded by default, as it's quite huge, so only callers that can do something with that data should request it.

It's not that only you (the caller) can work with it; the Appstream data is installed system-wide (to /var/cache/swcatalog/...), where any Appstream-capable app can use it. Preparing it beforehand makes sense, no?

Or do you mean that, in the "we" distro, where there's no gnome-software or KDE Discover, there's also no PackageKit (which downloads and installs the appstream data unconditionally)? Note these are only the GUI apps I know of that consume the appstream data. appstreamcli reads it too, to name one on the command line.

@mcrha
Contributor Author

mcrha commented Nov 13, 2024

> Hrm, the CI...

I looked around and, please correct me if I'm wrong, it seems the CI is located in a separate project, https://github.com/rpm-software-management/ci-dnf-stack . I looked into it briefly, but it's a black box to me; I do not even see which build parameters are applied there. I'm sorry.

@Conan-Kudo
Member

> > We do not want AppStream data being downloaded by default, as it's quite huge, so only callers that can do something with that data should request it.
>
> It's not that only you (the caller) can work with it; the Appstream data is installed system-wide (to /var/cache/swcatalog/...), where any Appstream-capable app can use it. Preparing it beforehand makes sense, no?
>
> Or do you mean that, in the "we" distro, where there's no gnome-software or KDE Discover, there's also no PackageKit (which downloads and installs the appstream data unconditionally)? Note these are only the GUI apps I know of that consume the appstream data. appstreamcli reads it too, to name one on the command line.

Right. If PackageKit-DNF5 or GNOME Software (through dnf5daemon) want it, they can ask for it to be downloaded. But the regular dnf5 and dnf5daemon CLIs can't do anything with it, so it's not useful to download.

Repositories can provide Appstream data for the packages they contain.
This Appstream data is consumed by applications like the GNOME Software
or KDE Discover, thus the users can see the packages (apps) in them.

This is to be in pair with PackageKit, which does download and install
the repo's Appstream data.

As this adds a dependency on the `appstream` library, there is also
a CMake option WITH_APPSTREAM to be able to turn the support off.
The support is enabled by default.

Closes rpm-software-management#1564
@mcrha
Contributor Author

mcrha commented Nov 13, 2024

While I agree, I also do not think it would be helpful. It's the same as asking PackageKit not to download and install the appstream data from the repository metadata: it cannot be done. This is better than PackageKit, because with dnf and PackageKit on one machine the repo data is duplicated in the local cache, which is bad (no PackageKit => half the repo data stored on the machine).

Could you point me to a repo that provides such large appstream data, please? I was under the impression that distros provide their appstream data in dedicated packages, not as repo metadata. Fedora has the appstream-data package, RPMFusion has rpmfusion-*-appstream-data (rpmfusion-free-appstream-data and so on). I do not know what package OpenSUSE uses, but, for example, the GNOME repo there does not provide appstream data as repo metadata, even though that repository contains GUI apps that do provide their .metainfo.xml files. RPMFusion's Steam repo, for example, does provide the Appstream data, see it here.

Is it possible we are talking about different things?

@mcrha
Contributor Author

mcrha commented Nov 13, 2024

If downloading and installing the repo's appstream metadata is a per-session option, then the repository refresh needs to work differently. When something updates the repo without the appstream data and another app later needs that data, a simple "is the repo up to date" check is no longer enough: one would need two checks, one for the repo and one for the appstream data. Handling a half-updated repo sounds complicated on its own, and even more so for the maintenance of the code.
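
To make the "two checks" concern concrete, here is a minimal sketch of per-type freshness tracking. It is not libdnf5 code: `RepoCache`, `stale_types` and the revision strings are hypothetical names and values chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RepoCache:
    """Hypothetical record of what is cached locally for one repository."""

    # Maps a metadata type to the repomd revision it was downloaded from.
    synced_revision: dict[str, str] = field(default_factory=dict)


def stale_types(cache: RepoCache, current_revision: str, wanted: set[str]) -> set[str]:
    """Return the wanted metadata types that are missing or out of date."""
    return {
        md_type
        for md_type in wanted
        if cache.synced_revision.get(md_type) != current_revision
    }


# A CLI session refreshed the repository without AppStream data...
cache = RepoCache({"primary": "1731500000", "comps": "1731500000"})

# ...so a check limited to the core types reports nothing to do,
core_check = stale_types(cache, "1731500000", {"primary", "comps"})
# while a later session that also wants AppStream data still has work left.
gui_check = stale_types(cache, "1731500000", {"primary", "comps", "appstream"})
```

In this sketch a single "is the repo up to date" flag cannot distinguish the two sessions; the answer depends on which types the caller asks about.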

@kontura
Contributor

kontura commented Nov 14, 2024

I agree with @Conan-Kudo, I really don't think we should download it by default.

> If downloading and installing the repo's appstream metadata is a per-session option, then the repository refresh needs to work differently. When something updates the repo without the appstream data and another app later needs that data, a simple "is the repo up to date" check is no longer enough: one would need two checks, one for the repo and one for the appstream data. Handling a half-updated repo sounds complicated on its own, and even more so for the maintenance of the code.

I don't think this would be a problem. By default we already download just primary, comps and updateinfo; everything else, like changelogs (other) or filelists, is downloaded only when needed, typically in different sessions (different dnf invocations).

In my opinion this could fit well into the optional_metadata_types config option; then any client can configure it. A distribution could even set it in the dnf config file.
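
As a rough illustration of that suggestion, the sketch below reads optional_metadata_types from an INI-style config text and works out which metadata types a session would request. It is only a sketch: the sample config, the `requested_metadata_types` helper and, above all, "appstream" as an accepted value are assumptions taken from this discussion, not existing dnf5 behaviour.

```python
import configparser

# Illustrative config text. "appstream" as an accepted value is an assumption
# based on the proposal in this thread, not a documented option value.
SAMPLE_CONF = """
[main]
optional_metadata_types = comps,updateinfo,appstream
"""

# Primary metadata is always needed; everything else is opt-in.
ALWAYS_LOADED = {"primary"}


def requested_metadata_types(conf_text: str) -> set[str]:
    """Hypothetical helper: union of the mandatory and configured optional types."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    raw = parser.get("main", "optional_metadata_types", fallback="")
    optional = {item.strip() for item in raw.split(",") if item.strip()}
    return ALWAYS_LOADED | optional


types = requested_metadata_types(SAMPLE_CONF)
wants_appstream = "appstream" in types
```

A client that can use the data would add the value before the metadata sync; a plain CLI session would leave it out and download nothing extra.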

@kontura
Contributor

kontura commented Nov 14, 2024

> > Hrm, the CI...
>
> I looked around and, please correct me if I'm wrong, it seems the CI is located in a separate project, https://github.com/rpm-software-management/ci-dnf-stack . I looked into it briefly, but it's a black box to me; I do not even see which build parameters are applied there. I'm sorry.

Yes, it is located in ci-dnf-stack, but the build is done according to the dnf5.spec file in this repository. You would need to add the dependency there.

Though I think we should discuss whether it should be a dependency of libdnf, that is, whether the install should be done by libdnf. This is related to the other comments, but since libdnf is not using the metadata, it would make more sense to me if the install was done by the client that requested it.

@mcrha
Contributor Author

mcrha commented Nov 14, 2024

> This is related to the other comments, but since libdnf is not using the metadata, it would make more sense to me if the install was done by the client that requested it.

Okay.

In that case I'd need access to the metadata itself from the client (through the dnf5daemon). One option is access to the repomd.xml filenames for each repository. Doing N D-Bus calls, one for each of the N configured and enabled repos, sounds tedious and a waste of time and resources to me, so a single "give me the repomd filenames for each configured and enabled repo" call would be better. That would be paired with another API to get the URL of the repo metadata (which cannot be done in bulk then), to avoid code duplication and to avoid assumptions that can change in the future, such as where the files are stored, which libdnf knows but a client holding only the repomd.xml does not. The client would then be able to download the data from the provided URL, though you do more than just the download; you also verify the content, right?

Alternatively, add a new API to list the metadata provided by each repo, then a new API to download chosen types, which would return a list of full paths to the places where they were downloaded. Ideally it would allow bulk download: a list of types to download, resulting in a list of filenames.

Alternatively, anything you think would be better and would work for you.

Let me know which it is, please, and I can try to update the merge request accordingly, or pass it to someone who knows the internals of dnf5 better than I do (I know basically nothing). Thanks in advance.

@kontura
Contributor

kontura commented Nov 14, 2024

> Alternatively, anything you think would be better and would work for you.

I was thinking the client would just add something like "appstream" to the optional_metadata_types config option before metadata sync, libdnf would download/sync all the requested metadata as usual and then the client would call libdnf5::repo::Repo::get_metadata_path("appstream") or some daemon equivalent to get the actual file path(s).

This would probably need additional work to handle the fact that it's multiple files of unknown type with the prefix appstream/appdata.

This is just my idea; I will bring it up at our team meeting today.

@mcrha
Contributor Author

mcrha commented Nov 14, 2024

That would work for me. It's close to the second suggestion, but not exactly the same.

@mcrha
Contributor Author

mcrha commented Nov 14, 2024

> This would probably need additional work to handle the fact that it's multiple files of unknown type with the prefix appstream/appdata.

Maybe, so that libdnf does not need to decide whether it's a prefix/suffix/wildcard/regex/..., it could just list the downloaded (or all) metadata types from the repomd.xml file. The caller can then "filter" them as it needs and ask for the paths of those it is interested in.
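
The "list everything, let the caller filter" idea could look roughly like the sketch below. It parses a made-up repomd.xml excerpt with Python's standard library rather than going through libdnf5 or dnf5daemon; `list_metadata`, `is_appstream` and the sample file names are hypothetical, and the appstream/appdata prefix rule is simply the one mentioned in this thread.

```python
import xml.etree.ElementTree as ET

# XML namespace used by repomd.xml.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Made-up excerpt for illustration; real files also carry checksums, sizes, etc.
SAMPLE_REPOMD = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/primary.xml.gz"/></data>
  <data type="appstream"><location href="repodata/appstream.xml.gz"/></data>
  <data type="appstream-icons"><location href="repodata/appstream-icons.tar.gz"/></data>
  <data type="appdata"><location href="repodata/appdata.xml.gz"/></data>
</repomd>
"""


def list_metadata(repomd_text: str) -> dict[str, str]:
    """Map every metadata type listed in repomd.xml to its location href."""
    root = ET.fromstring(repomd_text)
    result = {}
    for data in root.findall("repo:data", REPO_NS):
        location = data.find("repo:location", REPO_NS)
        if location is not None:
            result[data.get("type")] = location.get("href")
    return result


def is_appstream(md_type: str) -> bool:
    """Caller-side filter: the library does not need to know this rule."""
    return md_type.startswith(("appstream", "appdata"))


available = list_metadata(SAMPLE_REPOMD)
interesting = {t: href for t, href in available.items() if is_appstream(t)}
```

Keeping the filter in the caller means a change in how AppStream types are named would only affect the clients that care about them.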

3 participants